Flock: Crawling the Twitter Social Graph

Authors

Brian Hudson

Supraja Gurajala

Wenjin Hu

Abstract

In this work, we developed Flock, a tool capable of crawling the Twitter social graph. To collect the social graph, we built a Twitter application that uses 30 user accounts and the Twitter REST API v1.1. To date, Flock has collected the profiles of over 48 million Twitter users connected by over 158 million links. Flock continues to actively crawl the more than 57 million valid Twitter user IDs that it has discovered so far. We store the collected data in an open-source NoSQL database (MongoDB) for further social network analysis.
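The abstract does not describe how the 30 accounts are scheduled, so the following is only a minimal sketch of one plausible approach: a breadth-first crawl that rotates requests across a pool of accounts, each with a fixed request budget per rate-limit window. The names `AccountPool`, `crawl`, and `fetch_friends`, as well as the budget value, are hypothetical and not taken from the paper. `fetch_friends` stands in for a real API call, and writing results to MongoDB is left out so the example stays self-contained.

```python
from collections import deque
from typing import Callable, Deque, Dict, Iterable, List, Optional, Set, Tuple


class AccountPool:
    """Round-robin pool of API accounts, each with a per-window request budget.

    Hypothetical helper: the budget is a parameter, not a value from the paper.
    """

    def __init__(self, accounts: Iterable[str], budget_per_window: int) -> None:
        self.budget_per_window = budget_per_window
        self.remaining: Dict[str, int] = {a: budget_per_window for a in accounts}
        self.order: Deque[str] = deque(self.remaining)

    def acquire(self) -> Optional[str]:
        """Return the next account with budget left, or None if all are spent."""
        for _ in range(len(self.order)):
            account = self.order[0]
            self.order.rotate(-1)  # move this account to the back of the queue
            if self.remaining[account] > 0:
                self.remaining[account] -= 1
                return account
        return None

    def reset_window(self) -> None:
        """Restore every account's budget at the start of a new window."""
        for account in self.remaining:
            self.remaining[account] = self.budget_per_window


def crawl(
    seed: int,
    fetch_friends: Callable[[str, int], Iterable[int]],
    pool: AccountPool,
    max_users: int,
) -> Tuple[Set[int], List[Tuple[int, int]]]:
    """Breadth-first crawl from `seed`; returns discovered IDs and directed links."""
    seen: Set[int] = {seed}
    frontier: Deque[int] = deque([seed])
    edges: List[Tuple[int, int]] = []
    crawled = 0
    while frontier and crawled < max_users:
        account = pool.acquire()
        if account is None:
            break  # a long-running crawler would wait for the next window here
        user = frontier.popleft()
        crawled += 1
        for friend in fetch_friends(account, user):
            edges.append((user, friend))
            if friend not in seen:
                seen.add(friend)
                frontier.append(friend)
    return seen, edges


if __name__ == "__main__":
    # Toy in-memory graph standing in for the real API.
    toy_graph = {1: [2, 3], 2: [3], 3: [1, 4], 4: []}
    pool = AccountPool(["account_a", "account_b"], budget_per_window=2)
    users, links = crawl(1, lambda account, uid: toy_graph.get(uid, []), pool, 10)
    print(len(users), "users,", len(links), "links")
```

Keeping discovered IDs (`seen`) separate from crawled users mirrors the distinction the abstract draws between the user IDs discovered so far and the profiles already collected.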

Similar resources

Twitter Data Collection: Crawling Users, Neighbors and Their Communication for Personal Attribute Prediction in Social Media

Adaptive Identification of Hashtags for Real-Time Event Data Collection

The widespread use of microblogging services, such as Twitter, makes them a valuable tool to correlate people’s personal opinions about popular public events. Researchers have capitalized on such tools to detect and monitor real-world events based upon this public, social perspective. Most Twitter event analysis approaches rely on event tweets collected through a set of pre-defined keywords. ...

Community Detection on Evolving Graphs

Clustering is a fundamental step in many information-retrieval and data-mining applications. Detecting clusters in graphs is also a key tool for finding the community structure in social and behavioral networks. In many of these applications, the input graph evolves over time in a continual and decentralized manner, and, to maintain a good clustering, the clustering algorithm needs to repeatedl...

A Faceted Crawler for the Twitter Service

Researchers, nowadays, have at their disposal valuable data from social networking applications, of which Twitter and Facebook are the most prominent examples. To retrieve this content, the Twitter service provides 2 distinct Application Programming Interfaces (APIs): a probe-based and a streaming one, each of which imposes different limitations on the data collection process. In this paper, we...

Loklak - A Distributed Crawler and Data Harvester for Overcoming Rate Limits

Modern social networks have become sources for vast quantities of data. Having access to such big data can be very useful for various researchers and data scientists. In this paper we describe Loklak, an open source distributed peer to peer crawler and scraper for supporting such research on platforms like Twitter, Weibo and other social networks. Social networks such as Twitter and Weibo pose ...

Publication date: 2013
